Extended SAX: Extension of Symbolic Aggregate Approximation for Financial Time Series Data Representation
Abstract
Efficient and accurate similarity search over large time series data sets is an important but non-trivial problem. Many dimensionality reduction techniques have been proposed to represent time series data effectively for such similarity search, including Singular Value Decomposition (SVD), the Discrete Fourier Transform (DFT), Adaptive Piecewise Constant Approximation (APCA), and the recently proposed Symbolic Aggregate Approximation (SAX). In this work we propose an extension of SAX, called Extended SAX, for efficient and accurate discovery of the important patterns that financial applications require. The original SAX approach offers very good dimensionality reduction and allows distance measures to be defined on the symbolic representation, but it is built on the Piecewise Aggregate Approximation (PAA), which reduces dimensionality by replacing each equal-sized frame with its mean value. This mean-based representation is likely to miss important patterns in some time series, such as financial time series. Extended SAX, proposed in this paper, adds two points per equal-sized frame, the maximum and the minimum, alongside the mean value for data approximation. We show that Extended SAX can improve representation precision without losing the symbolic nature of the original SAX representation. We empirically compare Extended SAX with the original SAX approach and demonstrate its improvement in quality.
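The difference between a mean-only frame summary and a min/mean/max summary can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes z-normalization, equiprobable Gaussian breakpoints as commonly used with SAX, and a series length divisible by the number of frames. The function names and the fixed (min, mean, max) symbol order are hypothetical choices made for illustration only.

```python
from bisect import bisect_right
from statistics import NormalDist, mean, pstdev


def znormalize(series):
    """Rescale a series to zero mean and unit (population) standard deviation."""
    mu = mean(series)
    sigma = pstdev(series)
    if sigma == 0:
        return [0.0 for _ in series]
    return [(x - mu) / sigma for x in series]


def breakpoints(alphabet_size):
    """Cut points that split a standard normal into equiprobable regions."""
    nd = NormalDist()
    return [nd.inv_cdf(i / alphabet_size) for i in range(1, alphabet_size)]


def to_symbol(value, cuts):
    """Map a value to a letter: 'a' for the lowest region, 'b' for the next, ..."""
    return chr(ord("a") + bisect_right(cuts, value))


def frames(series, n_frames):
    """Split a series into equal-sized frames (assumes exact divisibility)."""
    if len(series) % n_frames != 0:
        raise ValueError("series length must be divisible by n_frames")
    size = len(series) // n_frames
    return [series[i * size:(i + 1) * size] for i in range(n_frames)]


def sax(series, n_frames, alphabet_size):
    """Mean-only word: one symbol per frame, taken from the frame mean (PAA)."""
    cuts = breakpoints(alphabet_size)
    return "".join(
        to_symbol(mean(f), cuts) for f in frames(znormalize(series), n_frames)
    )


def extended_sax(series, n_frames, alphabet_size):
    """Three symbols per frame: (min, mean, max).

    The fixed ordering here is an assumption of this sketch, not taken
    from the paper.
    """
    cuts = breakpoints(alphabet_size)
    return [
        (to_symbol(min(f), cuts), to_symbol(mean(f), cuts), to_symbol(max(f), cuts))
        for f in frames(znormalize(series), n_frames)
    ]


# Two toy series whose frame means fall in the same regions but whose
# extremes differ.
flat = [1, 1, 1, 1, -1, -1, -1, -1]
spiky = [3, -1, 3, -1, -3, 1, -3, 1]

print(sax(flat, 2, 3), sax(spiky, 2, 3))
print(extended_sax(flat, 2, 3))
print(extended_sax(spiky, 2, 3))
```

With these toy inputs the mean-only words coincide while the three-symbol summaries differ, which is the kind of within-frame variation a mean-based representation cannot capture.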
Similar resources
An improvement of symbolic aggregate approximation distance measure for time series
Symbolic Aggregate approXimation (SAX), a major symbolic representation, has been widely used in many time series data mining applications. However, because a symbol is mapped from the average value of a segment, SAX ignores important information in the segment, namely the trend of the value change within it. Such an omission may cause a wrong classification in some cases, since the SAX repr...
A Symbolic Representation Method to Preserve the Characteristic Slope of Time Series
In recent years many studies have been proposed for knowledge discovery in time series. Most methods use some technique to transform raw data into another representation. Symbolic representation approaches have proven effective at speeding up processing and removing noise. The most commonly used algorithm at present is the Symbolic Aggregate Approximation (SAX). However, SAX doesn’t preserve the sl...
A SAX-GA approach to evolve investment strategies on financial markets based on pattern discovery techniques
This paper presents a new computational finance approach that combines a Symbolic Aggregate approXimation (SAX) technique with an optimization kernel based on genetic algorithms (GA). The SAX representation is used to describe the financial time series so that relevant patterns can be efficiently identified. The evolutionary optimization kernel is used here to identify the most relevant...
Satellite Images Analysis with Symbolic Time Series: A Case Study of the Algerian Zone
Satellite Image Time Series (SITS) are an important source of information for studying land occupation and its evolution. However, the very large volumes of stored digital data are usually not ready for direct analysis. To both reduce dimensionality and extract information, time series data mining generally involves a change of time series representation. In an objective of ...
Particle Swarm Optimization of Information-Content Weighting of Symbolic Aggregate Approximation
Bio-inspired optimization algorithms have been gaining popularity recently. One of the most important of these algorithms is particle swarm optimization (PSO). PSO is based on the collective intelligence of a swarm of particles. Each particle explores a part of the search space looking for the optimal position and adjusts its position according to two factors; the first is its own experienc...